Detecting Cellular Fraud Using Adaptive Prototypes
Abstract
This paper discusses the current status of research on fraud detection undertaken by Royal Holloway, University of London, as part of the European Commission-funded ACTS ASPeCT (Advanced Security for Personal Communications Technologies) project. Using a recurrent neural network technique, we uniformly distribute prototypes over Toll Tickets sampled from the U.K. network operator Vodafone. The prototypes, which continue to adapt to cater for seasonal or long-term trends, are used to classify incoming Toll Tickets to form statistical behaviour profiles covering both the short-term and long-term past. These behaviour profiles, maintained as probability distributions, form the input to a differential analysis that uses a measure known as the Hellinger distance[5] between them as an alarm criterion. Fine-tuning the system to minimise the number of false alarms is a significant task because of the low ratio of fraudulent to non-fraudulent activity. We benefit from using unsupervised learning in that no fraudulent examples are required for training. This is very relevant considering the currently secure nature of GSM, where fraud scenarios other than Subscription Fraud have yet to manifest themselves. It is the aim of ASPeCT to be prepared for the would-be fraudster on both GSM and UMTS.
Introduction
When a mobile-originated phone call is made, or when various inter-call criteria are met, the cells or switches that a mobile phone is communicating with produce information pertaining to the call attempt. These data records, produced for billing purposes, are referred to as Toll Tickets. Toll Tickets contain a wealth of information about the call so that charges can be made to the subscriber. By considering well-studied fraud indicators, these records can also be used to detect fraudulent activity. By this we mean interrogating a series of recent Toll Tickets and comparing a function of the various fields with fixed criteria, known as triggers.
A trigger, if activated, raises an alert status; accumulated alerts would lead to investigation by the network operator. Example fraud indicators include a new subscriber making long back-to-back international calls, which is indicative of direct call selling, and short back-to-back calls to a single land-line number, which indicate an attack on a PABX system. Sometimes geographical information deduced from the cell sites visited in a call can indicate cloning. This can be detected by setting a velocity trap. Fixed trigger criteria can be set to catch such extremes of activity, but these absolute usage criteria cannot trap all types of fraud.
An alternative approach to the problem is to perform a differential analysis. Here we develop behaviour profiles relating to the mobile phone's activity and compare its most recent activities with a longer history of its usage. Techniques can then be derived to determine when the mobile phone's behaviour changes significantly. One of the most common indicators of fraud is a significant change in behaviour.
The performance expectations of such a system must be of prime concern when developing any fraud detection strategy. To implement a real-time fraud detection tool on the Vodafone network in the U.K., it was estimated that, on average, the system would need to be able to process around 38 Toll Tickets per second. This figure varied with peak and off-peak usage and also had seasonal trends. The distributions of the times at which calls are made and of the duration of each call are highly skewed. Considering all calls that are made in the U.K., including the use of supplementary services, we found the average call duration to be less than eight seconds, hardly time to order a pizza.
In this paper we present one of the methods developed under ASPeCT that tackles the problem of skewed distributions and seasonal trends using a recurrent neural network technique based on unsupervised learning.
We envisage that this technique would form part of a larger fraud detection suite that also comprises a rule-based fraud detection tool and a neural network fraud detection tool that uses supervised learning on a multi-layer perceptron. Each of the systems has its strengths and weaknesses, but we anticipate that the hybrid system will combine their strengths.
The following section discusses in more detail the concept of behaviour profiling for the purposes of performing a differential analysis. This is followed, in section 3, by the neural network prototyping technique and the way these prototypes are used to generate behaviour profiles. In section 4 we describe the workings of the fraud engine and follow up with some preliminary results. Lastly we discuss how we cater for changing distributions.
Behaviour profiling
For a differential analysis we need information about the mobile phone's history of behaviour plus a more recent sample of the mobile phone's activities. An initial attempt might be to extract heuristic information from the Toll Tickets and store it in record format. For this simple scenario we would need to consider two windows, or time spans, over the sequence of transactions for each user. The shorter sequence could be called the Current Behaviour Profile (CBP) and the longer sequence the Behaviour Profile History (BPH). Both profiles could be treated as finite-length queues. When a new Toll Ticket arrives relating to a given user, the oldest entry from the BPH would be discarded and the oldest entry from the CBP would move to the back of the BPH queue. The new record encoded from the incoming Toll Ticket would then join the back of the CBP queue. Clearly, in practice, it is not optimal to search and retrieve a history of transaction records from a database prior to each calculation on receipt of a new Toll Ticket.
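As a minimal sketch of the two-queue scheme just described, the Python below keeps a CBP and a BPH for one subscriber. The class name `QueueProfiles`, the queue lengths and the use of integers as stand-ins for encoded Toll Ticket records are all assumptions made for illustration; they do not come from the paper.

```python
from collections import deque


class QueueProfiles:
    """Two finite-length queues for one subscriber (hypothetical helper).

    cbp holds the most recent records, bph the older history.
    """

    def __init__(self, cbp_len=3, bph_len=5):
        self.cbp_len = cbp_len
        self.cbp = deque()
        # A deque with maxlen discards its oldest entry when a new one
        # is appended to a full queue, matching the BPH behaviour.
        self.bph = deque(maxlen=bph_len)

    def add(self, record):
        # Once the CBP is full, its oldest entry moves to the back of the BPH.
        if len(self.cbp) == self.cbp_len:
            self.bph.append(self.cbp.popleft())
        # The new record joins the back of the CBP.
        self.cbp.append(record)


profiles = QueueProfiles(cbp_len=3, bph_len=5)
for ticket_id in range(10):  # integers stand in for encoded Toll Tickets
    profiles.add(ticket_id)
```

After ten records the CBP holds the three newest and the BPH the five before them. Every record has to be stored and retrieved per subscriber, which is the cost the text goes on to avoid.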
Instead we compute a single behaviour profile record, which we store in a database using the International Mobile Subscriber Identity (IMSI) as the primary key. As a new Toll Ticket arrives for a particular subscriber, the profile record is simply updated with information reduced from the Toll Ticket. In order to preserve the concept of two different time spans over the Toll Tickets, we need to decay the influence of previous Toll Tickets on the profiles before adding in information from the new one. By applying two different decay factors we can maintain the concept of a CBP and a BPH. Of course, we have to be careful that applying a decay factor does not dilute information and thereby introduce false information into the behaviour profile. The following section describes a prototyping technique, based on the Second Maximal Entropy Principle by Grabec[1], which enables us to construct statistical behaviour profiles that are simple to decay without introducing false behaviour patterns.
Prototyping
Prototyping is a method of forming an optimal discrete representation of a naturally continuous random variable. The processing of continuous random variables by discrete systems generally reduces empirical information. Neural networks are capable of forming optimal discrete representations of continuous random variables through their ability to converge, by lateral interaction, to stable uniformly distributed states (von der Malsburg[4]). Grabec[1] introduces a technique to dynamically generate prototypical values to span a continuous random variable as samples are taken from it. He also suggests an extension to generate prototypes for multidimensional random variables. In its simplest form his method resembles the better-known self-organisation technique developed by Kohonen (1989)[3]. However, Grabec's method does not restrict the prototypes to lie on a two-dimensional manifold, but allows them to form their own topology.
Only one pass through the training set is required, giving rise to the potential for online adaptation. Grabec introduced the second maximal entropy principle, which states:
The mapping of a continuous random variable $X$ into a set of $K$ discrete prototypes $Q$ reduces the empirical information by the least amount if a uniform distribution $\{P(q_i) = 1/K;\ i = 1, \ldots, K\}$, corresponding to the absolute maximum ($S_Q = \log K$) of information entropy, is assigned to $Q$.
When considering the set of all possible Toll Tickets, we clearly have a dimension to represent every parameter of a Toll Ticket that we wish to include in the analysis. Each parameter of a Toll Ticket can assume a range of values and is thus itself a random variable. Grabec's technique enables us to create a number of prototypes that dynamically and uniformly span the set of samples from a download of Toll Tickets taken from a live network. Because there are so few fraudulent Toll Tickets in comparison with non-fraudulent ones in a live network download, the prototypes will organise themselves as if the data were totally fraud free. The resulting set of prototypes will enable us to classify future incoming Toll Tickets with minimal loss of empirical information.
To distribute the prototypes over the input stream of Toll Tickets, we set up an iterative procedure that computes the change $\Delta q_{lm}$ in the current value of the $K$ prototypes $Q$ ...
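The decayed behaviour profiles and the Hellinger-distance alarm criterion described earlier can be illustrated with a small Python sketch. The update rule (scale every bin by a decay factor, then add the freed mass to the bin of the prototype that classified the new Toll Ticket), the decay values, the number of prototypes and the choice to update both profiles directly from each ticket are assumptions for illustration, not the paper's stated method. The Hellinger distance is written in its common form normalised to the range 0 to 1.

```python
import math


def decay_update(profile, k, decay):
    """Decay every bin, then add the freed mass (1 - decay) to bin k.

    If the profile sums to 1 beforehand, it still sums to 1 afterwards.
    This update rule is an assumption, not taken from the paper.
    """
    return [decay * p + ((1.0 - decay) if i == k else 0.0)
            for i, p in enumerate(profile)]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * s)


K = 4                                # hypothetical number of prototypes
CBP_DECAY, BPH_DECAY = 0.8, 0.98     # hypothetical decay factors

cbp = [1.0 / K] * K                  # short-term profile
bph = [1.0 / K] * K                  # long-term profile

# Simulate a sudden change: 20 consecutive tickets fall on prototype 2.
for _ in range(20):
    cbp = decay_update(cbp, 2, CBP_DECAY)
    bph = decay_update(bph, 2, BPH_DECAY)

distance = hellinger(cbp, bph)       # compared against an alarm threshold
```

Because the CBP forgets faster than the BPH, a sustained change in behaviour pulls the two distributions apart and the distance grows. Choosing the alarm threshold is the fine-tuning problem the abstract mentions.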